The common conserved structural features of msDNAs are described. A synthesis of msDNAs is described 
which involves a necessary reverse transcriptase. Reverse transcriptases are described which have unique 
properties in the synthesis of cDNAs. Various utilities are described. 

FIELD OF THE INVENTION 

The invention relates to the field of recombinant DNA. More particularly, the invention relates in a 
generic manner to a unique and unusual genetic structure, a multi-copy single-stranded DNA/RNA hybrid 
structure, herein designated as msDNA. The invention also relates to reverse transcriptases (RTs) which are 
capable of synthesizing a cDNA molecule from an RNA template in a unique manner. The invention also 
relates to a cell-free synthesis of msDNAs with the RTs. 

BACKGROUND OF THE INVENTION 

Individual species of msDNAs and the reverse transcriptases essential for their synthesis have been 
discussed in our pending patent applications and in our publications. We have discovered that, 
notwithstanding the great diversity of these msDNA species, msDNAs share essential, common and conserved 
structural and functional elements. The invention therefore relates to such DNAs, whether known individual 
species discussed in our earlier patent applications or other msDNAs to be identified in the future, to the 
extent that they share these common features. 

Until recently it had been commonly believed that retroelements which encode RTs are found exclusively 
in eukaryotes and that bacterial populations do not contain retroelements. The finding of retroelements 
in prokaryotes and the requirement of reverse transcriptase (RT) for msDNA synthesis have raised 
fundamental scientific questions regarding the possible origin and evolution of the retroelement encoding 
the RT, the molecular mechanisms of msDNA synthesis, and the functions of msDNAs in cells. 

Novel findings have also been made regarding a possible mechanism of synthesis of the msDNAs by 
RTs. Thus, the studies carried out and the associated discoveries have important scientific significance. 
The msDNAs also have important utilities, as described hereinafter. These structures are therefore also 
significant from the practical point of view in molecular biology, medicine, immunology and other 
applications. 

United States patent applications relating to various msDNAs and RTs are the following: 

Serial No. 07/315,427 discloses a method for synthesizing various msDNAs in vitro. By this method a 
variety of synthetic msDNAs can be prepared in an efficient and practical manner. Serial No. 07/315,316 
discloses an msDNA molecule from a prokaryote, M. xanthus. This was a particularly noteworthy 
breakthrough in this series of discoveries. Serial No. 07/315,432 discloses an msDNA molecule from another 
prokaryote, E. coli. This invention contributed to the generic finding of msDNA structures whose synthesis is 
dependent on RT in prokaryotes. Serial No. 07/517,946 discloses prokaryote msDNAs synthesized from 
DNA fragments designated as retrons. Serial No. 07/518,749 discloses further msDNA molecules 
synthesized from recombinant DNA constructs designated as retrons. Serial No. 07/753,110 discloses a large 
variety of msDNAs synthesized in vivo in eukaryotic organisms such as yeast, plant cells and mammalian 
cells. 

For background art, one skilled in the art may refer to Dhundale et al., Cell, 51, pp. 1105-1112 (1987); Weiner 
et al., Ann. Rev. Biochem., 55, pp. 631-661 (1986); Yee et al., Cell, 38, pp. 203-209 (1984); and Lim and 
Maas, Cell, 56, pp. 891-904 (March 10, 1989). Other background references of interest may be found in the 
above-referenced patent applications and are cited in the REFERENCES pages of this application. 

RELATED PATENT APPLICATIONS 

This is a continuation-in-part of allowed U.S. application Serial No. 07/315,427, filed February 24, 1989, 
entitled "The Use of Reverse Transcriptase to Synthesize Branched-RNA Linked Multi-Copy 
Single-Stranded DNA", by Bert C. Lampson, Masayori and Sumiko Inouye; and of pending U.S. applications Serial 
Numbers 07/315,316, filed February 24, 1989, entitled "Reverse Transcriptase from Mycobacteria", by 
Masayori and Sumiko Inouye, Mei-Yin Hsu, Susan Eagle; 07/315,432, filed February 24, 1989, entitled 
"Reverse Transcriptase from E. Coli", by Bert C. Lampson, Jing Sun, Mei-Yin Hsu, Jorge Vallejo-Ramirez, 
Masayori and Sumiko Inouye; also of 07/517,946, filed May 2, 1990, entitled "Prokaryotic Reverse 
Transcriptase", by Masayori and Sumiko Inouye, Bert C. Lampson, Mei-Yin Hsu, Susan Eagle, Jing Sun, 
Jorge Vallejo-Ramirez; 07/518,749, filed May 2, 1990, entitled "E. coli msDNA Synthesizing System, 
Products and Uses", by Masayori and Sumiko Inouye; also of 07/753,110, filed August 30, 1991, entitled 
"Method for Synthesizing Stable Single-Stranded cDNA in Eukaryotes by Means of a Bacterial Retron, 
Products and Uses Therefor", by Shohei Miyata, Atsushi Ohshima, Masayori and Sumiko Inouye. These 
applications are incorporated herein by reference. 

Dhundale et al., referred to above, speculate about a possible mechanism for the synthesis of 
msDNA. The publication discusses a nucleotide fragment which is presumed to encode msDNA. Although 
the fragment contains portions of the elements necessary to code for msDNA, it does not contain an open 
reading frame to code for a reverse transcriptase (RT), which is necessary for the synthesis of msDNA. 

The present invention incorporates earlier disclosures in pending U.S. patent application Serial No. 
07/315,427, filed February 24, 1989, entitled "Production of Branched RNA-linked Multi-copy Single-Stranded 
DNA using Permeabilized Cells", and the other applications identified above. In the first three of these 
applications, there are disclosed all the DNA and RNA elements necessary to code for the entire msDNA molecule, 
including the open reading frame which codes for the reverse transcriptase (RT) and, when present, the 
ribonuclease H (RNase H) domains. 

The discovery of the location of the open reading frame in the same DNA fragment as the gene 
encoding the RNA and DNA portions of the final msDNA molecule could not be foreseen at that time. This 
observation is further supported by a recent publication of independent researchers, Lease and Yee, in JBC, 
266, 14497-14503 (August 1991), entitled "Early Events in the Synthesis of the Multicopy Single-stranded 
DNA-RNA Branched Copolymer of Myxococcus xanthus". The authors question whether a reverse transcriptase 
by itself is sufficient to completely and directly synthesize msDNA on an RNA template. They 
propose an alternative model for the synthesis of msDNA, in which a single-stranded DNA corresponding 
to the DNA portion of the msDNA is first synthesized in a conventional 
manner by a 3' to 5' priming reaction; this DNA strand is then ligated at its 5' end to the 2'-OH group of the 
branched rG residue of msdRNA, forming a 2',5'-phosphodiester linkage. In contrast, the disclosure in 
the earlier patent applications identified above and the disclosure made herein clearly exclude this 
alternative model. It was found that the synthesis of msDNA-Ec67 is primed de novo by a single dNTP base 
using an RNA precursor molecule. Furthermore, the first deoxynucleotide addition as well as the extension 
of the DNA strand from the first base is absolutely dependent upon the template RNA sequence and RT. It 
is thus apparent that msDNA is synthesized directly on an RNA template by reverse transcriptase. 
The 5' end sequence of the msr-msd transcript (bases 1-13) forms a duplex with the 3' end sequence of 
the same transcript, thus serving as a primer as well as a template for msDNA synthesis by reverse 
transcriptase. It appears therefore that the reverse transcriptases with which the group of researchers 
named in the earlier patent applications and herein have been working are unique. The reverse transcriptases 
are essential and are capable by themselves of synthesizing each of the entire msDNA molecules. The synthesis 
is initiated by a novel 2',5'-branched priming event on the folded msr template in which a dT residue is 
linked to the 2'-OH of an internal rG residue of the msdRNA molecule. This is further described below. 
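
By way of illustration only, the template dependence of the first base addition may be sketched as follows. The sketch assumes ordinary Watson-Crick pairing between the RNA template base and the incoming deoxynucleotide, and assumes that position 118 of the precursor RNA (the site of mutations 1 and 2 in Fig. 1C) is the base that templates the first addition. The helper name `predicted_first_dntp` is hypothetical, and the predictions for the mutant templates are inferences under these assumptions, not results reported herein.

```python
# Illustrative sketch only: assumes Watson-Crick pairing at the priming site.
# Maps an RNA template base to the deoxynucleotide that would pair with it.
# "T" is accepted as an alias for "U" because the mutations are named at the DNA level.
PARTNER = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def predicted_first_dntp(template_base: str) -> str:
    """Return the dNTP expected to be joined to the branched rG (hypothetical helper)."""
    return "d" + PARTNER[template_base.upper()] + "TP"


# The wild-type template has A at the assumed templating position, which gives
# dTTP, in agreement with the dT residue described in the text.
for label, base in [
    ("wild-type (A)", "A"),
    ("mutation 1 (A to T)", "T"),
    ("mutation 2 (A to G)", "G"),
]:
    print(label, "->", predicted_first_dntp(base))
```

Only the wild-type case (dT) is stated in the text; the other two lines show what the same rule would predict if the assumptions hold.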



SUMMARY OF THE INVENTION 



The invention relates to three main embodiments: the generic features of msDNAs; RTs which have 
the ability to synthesize cDNA from a template by a unique 2',5'-priming event; and a cell-free system to 
synthesize msDNAs with such RTs. 

The description of these embodiments presents two unprecedented aspects in molecular biology: first, 
the priming of cDNA synthesis from the 2'-OH group of an internal guanosine residue in the RNA strand, 
and second, the existence of reverse transcriptase in prokaryotes. 

The invention encompasses broadly a DNA/RNA hybrid structure which comprises a single-stranded 
DNA portion linked with and forming an integral part with a single-stranded RNA portion, herein designated 
as msDNA. The msDNAs are produced in several hundred copies from a genetic element identified herein 
as a "retron", and are therefore identified as multicopy, single-stranded DNAs or msDNAs. A generic 
representation of common features of the msDNAs of the invention is shown below. 

An important and valuable feature of the msDNAs is that, notwithstanding their single-strandedness, they 
have a remarkable stability which makes them very well suited for several utilities. Of particular interest is the use 
of the msDNAs in antisense applications against the mRNA of a target gene encoding a protein, as will be 
described hereinafter. 

It will be observed from the graphic generic representation of the common features of the msDNAs 
shown below that the msDNA is a molecule which is constituted of a stable hybrid branched RNA portion 
covalently linked to a single-stranded DNA portion by a 2',5'-phosphodiester bond between the 2'-OH group of 
an internal rG residue and the 5'-phosphate of the DNA molecule, and non-covalently linked to the DNA by 
base pairing between the complementary 3' ends of the RNA and DNA molecules. In the msDNA molecule, the 
RNA and DNA portions form one or more stable stem-and-loop secondary structures. The msDNAs are 
encoded by a single primary transcript, pre-msdRNA, which in turn is encoded by a genetic element called 
a retron. The retrons are genetic elements which contain a coding region msr for the RNA portion of the 
hybrid molecule, a coding region msd for the DNA portion of the msDNA molecule, and an open reading 
frame (ORF). The pre-msdRNA likewise comprises the ORF and the msr and msd regions. Synthesis of the 
msDNAs requires the transcription of the region encompassing the msr-msd regions and the ORF. However, 
the ORF and the msr-msd regions do not necessarily have to be present in the same transcriptional unit. 

All msDNAs possess this unique branched linkage joining the RNA and 
DNA strands. Further, the branched residue is, in all cases, an internal guanosine residue in the 5' end of 
the RNA transcript. 

Another conserved feature of the msDNAs is the base pairing of the 3' ends of the DNA and RNA 
portions. A further conserved feature of the retrons that code for the msDNAs is a set of inverted repeat (IR) 
sequences, which are located as described hereinafter. The existence of the IRs is essential for the synthesis of 
msDNAs which contain the typical stem-loop structures. They allow the transcript RNA to fold into important 
secondary structures. 
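
The way in which a pair of inverted repeats permits an RNA to fold into a stem may be sketched as follows. This is a minimal illustration assuming perfect Watson-Crick pairing (no G-U pairs or mismatches); the function names and the toy sequence are hypothetical and do not correspond to any retron sequence disclosed herein.

```python
# Illustrative sketch only: exact-match search for inverted repeats in an RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that pairs with `rna` in an antiparallel duplex."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_inverted_repeats(rna: str, arm_len: int, min_gap: int = 3):
    """Return (i, j) start indices of two arms of length arm_len that can form a stem.

    The arm starting at j must be the reverse complement of the arm starting at i,
    and the arms must be separated by at least min_gap unpaired bases (the loop).
    """
    hits = []
    n = len(rna)
    for i in range(n - 2 * arm_len - min_gap + 1):
        target = reverse_complement(rna[i:i + arm_len])
        for j in range(i + arm_len + min_gap, n - arm_len + 1):
            if rna[j:j + arm_len] == target:
                hits.append((i, j))
    return hits


# Toy transcript: arm "GGAUC", a 4-base loop, then the pairing arm "GAUCC".
toy = "AAGGAUCUUUUGAUCCAA"
print(find_inverted_repeats(toy, arm_len=5))
```

An actual IR may contain non-complementary bases, in which case an exact-match search such as this one would have to be relaxed to tolerate mismatches.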

From the description herein, it is to be noted that the generic representation of the msDNA of the 
invention provides optional common secondary structures, like the stem-and-loop structure, which if present 
is part of the ssDNA portion of the molecule, and at least one stem-and-loop structure which is part of the ssRNA 
portion of the molecule. Further, the msDNAs of the invention may have different nucleotide lengths, both 
with respect to their DNA and RNA portions. Other variables of the msDNAs will become apparent from the 
description that follows. 

The invention also relates to RTs which are capable by themselves of synthesizing the entire msDNA 
molecule from a template, starting with a priming event which forms a unique 2',5'-linkage between the 
template molecule and the first nucleotide at the 5' end of the cDNA strand. The RTs have interesting 
practical applications. 

The invention also relates to a cell-free synthesis in which RT synthesizes cDNA from an RNA template 
and forms the entire msDNA structure. The cell-free system provides further confirmation of the unique 
property of the RTs. 

DESCRIPTION OF THE FIGURES 



FIG. 1 shows the restriction map of pC1-1EP5, the proposed secondary structure of msDNA-Ec67 and a 
putative secondary structure of the precursor RNA molecule. Part A shows the restriction map of pC1-1EP5 
(Lampson et al., 1989b). The BssHI site changed to a BamHI site is shown by an arrowhead, and the 
XbaI site created by site-specific mutagenesis is also shown by an arrowhead. Locations and orientation of msr 
and msd and the RT gene are shown by arrows. The regions cloned into p67-BH0.6 and p67-RT are 
indicated by open boxes, respectively. Part B shows the structure of msDNA-Ec67 (Lampson et al., 1989b). 
The branched rG is circled and RNA is boxed. Both RNA and DNA are numbered from their 5'-ends. Part C 
shows a putative secondary structure of the precursor RNA molecule. The 5'-end of the RNA transcript was 
determined by primer extension (Hsu et al., unpublished results). The 3'-end of the RNA molecule is 
considered to form a stem structure using the inverted repeat sequences, a1 and a2 (arrows), in the primary 
RNA transcript (Lampson et al., 1989b). The branched rG is circled. Bases changed by mutations are 
indicated by arrows with individual designations. Open and filled triangles indicate the positions of the 
3'-ends of RNA and DNA in msDNA-Ec67, respectively. 

FIG. 2 shows the specificity of the priming reaction of msDNA-Ec67 synthesis in vitro. The reaction for the 
first base addition was carried out as described in Experimental Procedures; the reaction mixture contained 
an RNA fraction from a 1-ml culture and 5 µCi of each [α-32P]dNTP in separate reactions in 10 µl of RT 
buffer. The reaction was started by adding 2 µl of the partially purified RT and the mixture was incubated at 
37°C for 30 minutes. Lanes 1 to 4, the reaction was carried out with the RNA fraction from CL83 cells 
harboring p67-BH0.6 (wild-type); lanes 5 to 8, from cells harboring p67-mut-1 for the A to T mutation at 
position 118 in Fig. 1C (mutation 1); lanes 9 to 12, from cells harboring p67-mut-2 for the A to G mutation at 
position 118 in Fig. 1C (mutation 2); and lanes 13 to 16, from cells harboring p67-mut-3 for the G to A 
mutation at position 15 in Fig. 1C. The [α-32P]dNTP used for each lane is indicated on the top of each lane. The 
MspI digest of pBR322 labeled with the Klenow fragment and [α-32P]dCTP was applied to the extreme 
left-hand lane as molecular weight markers. Numbers indicate sizes of fragments in bases. An arrowhead 
indicates the position of the precursor RNA molecule specifically labeled with dNTP for each RNA 
preparation. 

FIG. 3 shows a schematic diagram of the production of bands a and b. Thin and thick lines represent 
RNA and DNA, respectively, and the arrowheads indicate the 3'-end. Open and filled triangles indicate the 
3'-ends of the RNA and DNA strands in msDNA, respectively. Broken lines indicate base pairings in the 
double-stranded RNA structure at the 5'-end of msdRNA. Structure I is first formed from the primary 
transcript from retron-Ec67. The unhybridized 3'-end is probably removed in the cells; the resulting 
structure is identical to that shown in Fig. 1C. When dTTP is added with RT-Ec67 in the cell-free reaction 
mixture, a dT is linked to the 2'-OH group of an internal rG residue (circled in the Figure) by a 
2',5'-phosphodiester linkage (structure II). When the other three dNTPs are added, the DNA strand is elongated 
along the RNA template. As the DNA strand is extended, the RNA template is concomitantly removed 
(structure III). The DNA synthesis is terminated at the position indicated by a solid triangle, leaving a 7-base 
DNA-RNA hybrid at their 3'-ends, yielding structure IVa or IVb. RNase A treatment of structure IV results in 
structure V. When structure V is incubated in a boiling water bath, structure VI is formed. Structures VIa and 
VIb correspond to bands a and b in Fig. 2, respectively. 

FIG. 4 shows the chain termination reaction during msDNA synthesis in the cell-free system. The extension 
reaction for msDNA synthesis was carried out in the presence of dideoxy NTP as described in Experimental 
Procedures. Individual chain termination reaction mixtures containing either ddGTP, ddATP, ddTTP or 
ddCTP were applied to lanes G, A, T and C, respectively. The resulting ladder was read at the right-hand 
side, which corresponds to the DNA sequence of msDNA-Ec67 from base 24 to 54 (see Fig. 1C; Lampson 
et al., 1989b). The same molecular weight markers as in Fig. 2 were applied to the extreme left-hand lane, 
and the sizes in bases are indicated at the left-hand side. Four major products are indicated by arrows 
with a, b, c and d as schematically drawn in Fig. 3. 

FIG. 5 shows ribonuclease treatment of the band b and d products. The products after the full extension 
reaction in the cell-free system (without ddNTP) were applied to a DNA sequencing gel (lane 2). Band b 
(lane 3) and band a (lane 4) were isolated from preparative gel electrophoresis, and digested with RNase A 
(lanes 5 and 6, respectively). The molecular weight markers (lane 1) are the same as in Fig. 2. 
FIG. 6 shows the complete nucleotide sequence of typical msDNAs. 
FIG. 7 shows another msDNA, Ec74. 
FIG. 8a and 8b show two other msDNAs, Ec100 and Ec101. 
FIG. 9 shows the protocol for constructing synthetic msDNA 100. 
FIG. 9b shows genes and components of synthetic msDNA. 
FIG. 10 shows the protocol for construction of msDNA 101. 
FIG. 11 shows cDNA production obtained from RNA or DNA templates. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 



Generally, msDNA may be described as a molecule which comprises a branched single-stranded RNA 
portion which is covalently linked to a single-stranded DNA portion by a 2',5'-phosphodiester bond between 
the 2'-OH group of a branched rG residue internal to the RNA strand and the 5'-phosphate of the DNA 
molecule. Other common features are a non-covalently linked DNA-RNA hybrid at the 3' ends, which is 
formed by base pairing between the complementary 3' ends of the DNA and RNA molecules, and stable 
secondary structures in both the DNA and RNA strands. 

The extreme 3' end of the DNA strand contains a sequence complementary to the sequence at the 3' 
end of the RNA strand. This allows the overlapping 3' ends of the DNA and RNA to form an RNA-DNA 
base-paired region. The presence of this short RNA-DNA hybrid is a result of the mechanism by which 
msDNA is synthesized via RT. 
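
The base pairing between the two 3' ends can be illustrated with a short sketch that measures the extent of the overlap. It assumes strict Watson-Crick pairing in an antiparallel duplex; the function name `three_prime_overlap` and the toy sequences are hypothetical and are not taken from any msDNA disclosed herein.

```python
# Illustrative sketch only: measure the DNA-RNA duplex formed by two 3' ends.
# Allowed (DNA base, RNA base) pairs under strict Watson-Crick pairing.
PAIRS = {("A", "U"), ("T", "A"), ("G", "C"), ("C", "G")}


def three_prime_overlap(dna: str, rna: str) -> int:
    """Return the largest k for which the last k bases of each strand fully pair.

    Both strands are written 5' to 3'. In an antiparallel duplex of length k,
    the first of the DNA's last k bases pairs with the 3'-terminal RNA base,
    and the 3'-terminal DNA base pairs with the first of the RNA's last k bases.
    """
    best = 0
    for k in range(1, min(len(dna), len(rna)) + 1):
        if all((dna[-k + i], rna[-1 - i]) in PAIRS for i in range(k)):
            best = k
    return best


# Toy strands designed to share a 7-base overlap at their 3' ends.
toy_dna = "CCCCTCAGGCT"
toy_rna = "UUUUAGCCUGA"
print(three_prime_overlap(toy_dna, toy_rna))
```

The toy strands were built so that their last seven bases are mutually complementary, the same overlap length that the text reports for Ec67.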

The msDNA molecule exists free of the chromosome in the cell cytoplasm, and can be isolated by the 
same methods used to isolate plasmids. msDNA is stable in spite of the fact that the molecule consists of 
single-stranded RNA and DNA portions. This stability is believed to result from the branched structure that 
protects the 5' end of the DNA, the RNA molecule after the branched G residue and the 3' end DNA-RNA 
hybrid. Analysis of msDNA molecules reveals a large degree of nucleotide sequence diversity among them, 
with little, if any, primary sequence homology in either the DNA or RNA strand. However, in spite of their 
structural diversity, all msDNAs share important common primary and secondary structures, as 
described herein. 

The msDNAs of the invention are encoded by genetic elements designated as retrons. The retrons 
comprise three distinct regions: an msr region which codes for the RNA portion of the msDNA, an msd 
region which codes for the DNA portion of the msDNA, and an open reading frame (ORF) which codes for a 
polypeptide having reverse transcriptase (RT) activity. In one of the msDNAs (msDNA-Ec67), the ORF 
codes for an RT which has ribonuclease H (RNase H) activity. It is not excluded that other msDNAs yet to 
be discovered will also be synthesized by RTs which contain an RNase H domain. The above-discussed 
three elements can occur in a single operon, or the msr-msd region can be separate from the RT gene yet 
operate in concert with the RT gene to synthesize the msDNAs. The msr-msd 
region and the RT gene can be expressed under the control of a single promoter, or the RT can be 
expressed by a separate promoter. 

Transcription of the msr and msd regions yields a primary transcript, pre-msdRNA. This primary 
transcript encompasses all three regions: the msr, msd, and ORF regions of the retron, as is further 
described below. 

The common features of msDNAs, namely a) the 2',5'-phosphodiester linkage between a G residue within a 
continuous RNA strand and the 5' end of the DNA strand, b) stable secondary structures in the RNA and DNA 
portions and c) the RNA-DNA hybrid structure at their 3' ends, may be seen below in Formula I. 

In Formula I, the symbols have the following meanings, all lengths being determined in numbers of nucleotides: 

X represents the overlapping 3' ends of the complementary bases of the DNA and RNA strands. 
Y represents the length of the 5' end of the DNA strand linked to the branched rG residue of the RNA portion, 
defined from the first nucleotide that is not part of the stem of the stem-loop structure (not complementary 
to another base) to the last nucleotide of the 5' end. 
Z represents a portion of the stem of the stem-and-loop structure which includes the rG residue. 
S represents a portion of a typical stem-and-loop structure in the DNA portion. 
L represents a portion of a stem-and-loop structure of the RNA portion. 
W represents the length of the RNA strand from the internal rG residue to the first nucleotide of the stem-loop 
structure in the RNA portion of the molecule, i.e., to the first of such structures when more than one is present. 
W1, W2, W3, etc. represent the length of the RNA strand between two consecutive stem-loop structures in the RNA 
portion, when more than one is present in the W portion of that strand. 
V represents the length of the RNA strand extending between the portion of complementary bases (X) and the 
first nucleotide of the stem of the stem-loop structure in the RNA portion that is closest to the first 
complementary base in the 3' end of the RNA strand. 
Q represents the length of the DNA strand extending between the portion of complementary bases (X) at the 
3' end, counted from the last complementary base (remote from the 3' end), and the first nucleotide of the 
stem of the first stem-loop structure in the DNA strand. 
Q1, Q2, Q3, etc. represent the length of the strand between two consecutive stem-loop structures in 
the Q portion of the DNA strand of the molecule, when more than one such structure is present in the Q 
portion of that strand. 

All of the above lengths can vary considerably from one msDNA molecule to another. 

The number of the stem-loop structures in the DNA and in the RNA portions may vary depending on 
the number of inverted repeats in the msd and the msr regions of the retron. However, their presence is not 
essential. To the extent that the IR may have non-complementary bases, this fact will be reflected in the 
stem portion of the respective stem-loop structure, as shown in the Figures, by a loop or non-pairing bases 
of the stem. There may be one or more such non-pairing loops. 

The length of X may vary as described herein depending on the extent of overlap of the IR which 
constitutes the 3' end of the respective strands. Likewise, the length of Z and/or Y and/or L may vary 
considerably between individual msDNAs. 

The length of X in number of nucleotides among known msDNAs ranges from 5 to 11. In Ec73, it is 5; 
in Mx65 and Ec107 it is 6; in Ec67, it is 7; in Mx162 and Sa163, it is 8 and in Ec86, it is 11. The length of X 
in number of nucleotides can also vary outside of the range stated above. 

The length of Y in number of nucleotides among known msDNAs ranges from 7 to 69. In Ec73 and 
Ec107, Y is 7; in Mx162 and Sa163, Y is 13; in Ec86 it is 15; in Mx65 it is 16; in Ec67 it is 19; and 
in Ye117 it is 69. 

It is contemplated that other msDNAs can have longer or shorter Ys provided the basic common 
conserved features are not adversely affected. 

The number of stem-and-loop structures (S) in the DNA portion of known msDNAs is 1. The length of S 
in number of nucleotides varies among known msDNAs from 34 to 136. In Ec67 and Ye117 it is 34; 
in Mx65 it is 35; in Ec86 it is 53; in Ec73 it is 56; in Ec107 it is 89; and in Mx162 and Sa163 it is 136. 

The number of stem-and-loop structures (L) in the RNA portion of the msDNAs of the invention is at 
least 1. The length of L in number of nucleotides ranges among the msDNAs of the invention from 9 to 26. 
In Ec86, the lengths are 9 and 18; in Mx65 it is 18; in Ec67 and Ye117 it is 26 nucleotides; in Mx162 and 
Sa163 they are 15 and 20; in Ec107 they are 20 and 20; and in Ec73 they are 10 and 15. It is contemplated 
that other msDNAs can have a greater number and/or different lengths of stem-loop structures in the DNA 
and/or RNA portion. 

The total number of nucleotides in the msDNAs may vary over quite a range. In presently known 
msDNAs, the number of nucleotides ranges between 114 and 239. Likewise, the numbers of nucleotides of 
the RNA and the DNA portions vary between 49-82 and 65-163, respectively. The number of nucleotides 
of either or both portions can be varied, i.e., lengthened or shortened. Such larger or smaller msDNAs can 
be prepared from in vivo or in vitro synthesized msDNAs. 

Illustrated herein are msDNAs as follows: Mx162, which has 162 DNA nucleotides and 77 RNA 
nucleotides; Mx65, which has 65 DNA nucleotides and 49 RNA nucleotides; Sa163, which has 163 DNA 
nucleotides and 76 RNA nucleotides; Ec67, which has 67 DNA nucleotides and 58 RNA nucleotides; Ec86, 
which has 86 DNA nucleotides and 82 RNA nucleotides; Ec73, which has 73 DNA nucleotides and 75 RNA 
nucleotides; Ec107, which has 107 DNA nucleotides and 75 RNA nucleotides; msDNA-Ye, which has a total 
of 175 nucleotides (117 DNA and 58 RNA); msDNA-100, which has a total of 143 nucleotides (83 DNA and 
60 RNA); and msDNA 101, which has a total of 130 nucleotides (70 DNA and 60 RNA). 
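
The arithmetic behind these figures can be checked with a short tabulation. The sketch below simply transcribes the DNA and RNA lengths listed above and sums them; it assumes that the 73 nucleotides given for Ec73 refer to its DNA portion and that msDNA-Ye is the species elsewhere called Ye117. The variable names are illustrative.

```python
# Illustrative tabulation: (DNA nucleotides, RNA nucleotides) as listed in the text.
# Assumption: the "73 nucleotides" given for Ec73 is its DNA portion.
MSDNA_LENGTHS = {
    "Mx162": (162, 77),
    "Mx65": (65, 49),
    "Sa163": (163, 76),
    "Ec67": (67, 58),
    "Ec86": (86, 82),
    "Ec73": (73, 75),
    "Ec107": (107, 75),
    "Ye117": (117, 58),
    "msDNA-100": (83, 60),
    "msDNA-101": (70, 60),
}

# The total length of each hybrid molecule is the sum of its two portions.
TOTALS = {name: dna + rna for name, (dna, rna) in MSDNA_LENGTHS.items()}

# Print the molecules from shortest to longest.
for name, total in sorted(TOTALS.items(), key=lambda item: item[1]):
    dna, rna = MSDNA_LENGTHS[name]
    print(f"{name:10s} DNA={dna:3d} RNA={rna:2d} total={total}")
```

The smallest and largest totals obtained this way are 114 (Mx65) and 239 (Mx162 and Sa163), which agree with the range stated in the text.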

Another variable of the msDNAs is the overlap of complementary nucleotides of the DNA and RNA 
strands at their respective 3' ends, which are non-covalently linked. The minimum number of overlapping 
complementary nucleotides found to date is 5 and the maximum is 11. For example, Ec73 has 5 
overlapping bases. Mx65 and Ec107 have six. Ec67, msDNA-100, 101 and Ye117 have seven overlapping 
base-pairs. Mx162 and Sa163 have 8. Ec86 has 11. 

With respect to the 5' end of the RNA strand, the length of the strand up to the internal branched rG 
residue can also vary. The minimum number of residues counting from the 5' end of the RNA prior to the 
rG nucleotide residue found to date is 3 and the maximum is 19. The position of the branched G residue 
is 4 from the 5' end of the RNA strand for Mx65. For Ec86, the branched G residue is positioned 
at 14. For Ec67 and Ec73, the branched G residue is positioned at 15. For Ec107, the branched G residue 
is positioned at 18. For Sa163, the branched G residue is positioned at 19. And for Mx162, the branched G 
residue is positioned at 20. For msDNA-100, 101 and Ye117, the branched G residue is at residue 15. 

With respect to the IRs (a1 and a2), they too can vary in length. The minimum length of the repeat 
found to date is 12 and the maximum length is 34. The length of the inverted repeat in the retron of Ec86 is 
12. For Ec67 and Ec73, the length is 13. The length in Mx65 is 15. The length in Ec107 is 16. And the 
lengths of the inverted repeats in Sa163 and Mx162 are 33 and 34, respectively. 

Likewise, the distance between the msr-msd region and the ORF in various msDNAs can vary 
significantly. The minimum distance between the msd and the ORF found to date is 19 nucleotides and the 
maximum distance is 77. For example, in msDNA-Ec86, 19 nucleotides separate the ORF from the msd. In Mx65, the 
distance is 28. For Ec107, the distance is 50. The number of separating nucleotides in Ec67 is 51. For 
Ec73, it is 53. For Mx162, 77 nucleotides separate the ORF from the msd. 
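
The relationship between the stated positions and the stated minima and maxima can be made explicit with a brief sketch. It transcribes the values given above for the naturally occurring msDNAs and notes that the number of residues preceding the branched G is one less than its position; the dictionary and function names are illustrative.

```python
# Illustrative check of the ranges stated in the text (values transcribed from it).
BRANCHED_G_POSITION = {
    "Mx65": 4, "Ec86": 14, "Ec67": 15, "Ec73": 15,
    "Ec107": 18, "Sa163": 19, "Mx162": 20,
}
IR_LENGTH = {
    "Ec86": 12, "Ec67": 13, "Ec73": 13, "Mx65": 15,
    "Ec107": 16, "Sa163": 33, "Mx162": 34,
}
MSD_TO_ORF_DISTANCE = {
    "Ec86": 19, "Mx65": 28, "Ec107": 50, "Ec67": 51, "Ec73": 53, "Mx162": 77,
}

# A branched G at position p (counting from 1 at the 5' end) is preceded by p - 1 residues.
RESIDUES_BEFORE_BRANCH = {name: pos - 1 for name, pos in BRANCHED_G_POSITION.items()}


def span(values):
    """Return the (minimum, maximum) of a collection of lengths."""
    values = list(values)
    return min(values), max(values)


print("residues before branched G:", span(RESIDUES_BEFORE_BRANCH.values()))
print("inverted repeat length:", span(IR_LENGTH.values()))
print("msd to ORF distance:", span(MSD_TO_ORF_DISTANCE.values()))
```

The three spans obtained are 3 to 19, 12 to 34 and 19 to 77, matching the minima and maxima given in the text.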

For a listing of the variations of common features of the msDNAs of the invention, reference is made to 
Table 1. 

Provided the essential functional components of the msDNAs are preserved, i.e., those that are 
essential for synthesis and uses of the msDNAs, the other components of the msDNAs can be varied as 
desired. Thus, insertions and/or deletions in the msr and/or msd regions, or outside of the regions on the 
nucleotide sequence in which the retron is positioned, result in msDNA variants which retain the common 
generic features. 

As discussed below, for instance, insertions of nucleotide sequences at any site in the DNA and/or RNA 
portions (by appropriate insertions in the msd and/or msr genes) can produce very useful msDNAs that can 
serve as antisense vectors. 

Further, it should be noted that the ranges and nucleotide numbers given hereinabove do not include, 
except for species Ye117, FIG. 6(h), exogenous DNA or RNA fragments which can be inserted in the DNA 
and/or RNA portion of the msDNAs, or genes that will be found in final msDNAs as stem-loop structures. 

Variations in the msr-msd (or in the msr or msd) region (or outside thereof) cause corresponding 
variations in the RNA transcript. Variations in the dNTPs (in a cell-free system) are reflected in the RNA 
portion of the molecule. All such variations of the basic msDNA are considered within the invention. For 
instance, when RNase A is not added to the reaction mixture in a cell-free synthesis of msDNAs, an msDNA 
is formed that contains a double-stranded segment in what is considered the RNA portion. All such and 
other variants of the generic msDNA molecule are considered within the scope of the invention. 

msDNAs are encoded by retroelements which have been designated as retrons. The retrons contain 
msr and msd genes, which code for the RNA and DNA strands, respectively, of msDNA. The two genes are 
positioned in opposite orientation. The retron also comprises an ORF encoding a polypeptide which has 
reverse transcriptase (RT) activity. The initiation codon of the ORF is situated as close as 19 base-pairs 
from the start of the msd gene for certain msDNAs, as in Ec86, but as distant as 77 base-pairs in other 
msDNAs, as in Mx162. The ORF is situated upstream of the msd, but may also be situated downstream of 
the msd locus (i.e., downstream and upstream respectively of the msr locus). When the ORF is positioned 
in front or upstream of the msr region, increased yields of the msDNAs are obtainable in eukaryotic cells 
such as yeast. 

The msDNAs are derived from a much longer precursor RNA (pre-msdRNA), which has been shown to
form a very stable stem-and-loop structure. This stem-and-loop structure of pre-msdRNA serves as a
primer for initiating msDNA synthesis, and as a template to form the branched RNA-linked msDNA.
Transcription of the msr-msd region of the retron, which forms the pre-msdRNA, initiates at or near the 5' end
of msr, thus encompassing the upstream region of the msr, and extends beyond msd to include the ORF.
The 5' end sequence of the msr-msd transcript (bases 1 to 13) forms a duplex with the 3' end sequence of
the same transcript, thus serving as a primer as well as a template for msDNA synthesis by RT. The
promoter for the msr-msd region is upstream of msr, and transcription is from left to right, encompassing
the entire region including the RT gene downstream of msr.
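
As an illustration only, the duplex described above between the 5' end and the 3' end of a transcript can be checked computationally. The following minimal Python sketch assumes strict Watson-Crick pairing (no G-U wobble pairs); the function names and the toy sequence are hypothetical and do not represent any actual msr-msd transcript.

```python
# Watson-Crick RNA base pairs only; wobble pairs are deliberately ignored in this sketch.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Return the antiparallel complementary RNA strand."""
    return "".join(PAIR[base] for base in reversed(rna.upper()))

def ends_form_duplex(rna: str, k: int = 13) -> bool:
    """True if bases 1..k can pair, antiparallel, with the last k bases."""
    rna = rna.upper()
    if len(rna) < 2 * k:
        return False
    five_prime = rna[:k]
    three_prime = rna[-k:]
    # Base i counted from the 5' end faces base i counted from the 3' end.
    return all(PAIR.get(a) == b for a, b in zip(five_prime, reversed(three_prime)))

# Hypothetical 13-base 5' arm, an arbitrary loop, and the matching 3' arm.
arm = "ACGGUCAUGCAGC"
toy_transcript = arm + "GAAAUUUGGG" + reverse_complement(arm)
print(ends_form_duplex(toy_transcript))
```

The default of 13 bases mirrors the duplex length given above for the msr-msd transcript; other lengths can be passed through the parameter k.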

The RTs described herein are capable of forming a branched linkage between the 2'-OH of any of the
internal rG residues and the 5'-phosphate of the first deoxyribonucleotide triphosphate. This unique and
quite unusual property of the RTs is further described herein below in conjunction with the synthesis of
msDNAs.

The proposed mechanism of synthesis of the msDNAs comprises transcription of a long primary mRNA
transcript beginning upstream from and including the msr region of the retron, extending to and including
the msd region and including the ORF encoding the RT; folding of the primary mRNA transcript into stable
stem-loop structures between and by means of two inverted repeat sequences, which folded mRNA
transcript functions both as a primer and template for cDNA synthesis by RT; forming a branched linkage
between the 2'-OH of an internal rG residue and the 5'-phosphate of the first deoxyribonucleotide of the
cDNA strand and continuing cDNA synthesis by RT using the folded RNA as a template, with removal of
the RNA template within the growing DNA/RNA duplex by means of RNase H processing and termination of
msDNA synthesis. It is believed that activity of the RNase may be concomitant with that of the RT but this
is not necessarily so.

The RT is capable by itself of synthesizing a cDNA (in this example, an msDNA molecule) utilizing the
folded primary transcript, which transcript functions both as a primer and a template, by initiating synthesis
with the formation of a unique 2',5'-linkage between the first deoxyribonucleotide residue of the cDNA
molecule and an internal rG residue of the particular msDNA molecule; in the case of msDNA-Ec67, the
15th residue.

This activity is in contrast to known retroviral RTs, which are reported to initiate synthesis by formation
of a 3',5'-linkage.

The primary transcript from the msr-msd region is folded to form a stem structure between the region
immediately upstream of the branched rG residue and the region upstream of msd in such a way that the
rG residue is placed at the end of the stem structure. In the mechanism described, msDNA synthesis is
primed and cDNA synthesis commences from the 2'-OH of an internal rG residue using the bottom RNA
strand as a template, this reaction being mediated solely by the RT encoded by the ORF of the primary
transcript.

The synthesis of msDNAs is initiated by an intramolecular priming event which starts at an internal
guanosine residue (rG). The double-stranded nature of the 5' end of the folded primary transcript (due to its
inverted repeat) functions as a priming site recognized by the RT.

In the situations where it is desired to use the RTs herein described to transcribe a non-self-priming
template (or an mRNA that does not carry a poly(A) tail at the 3' end), it will be necessary to provide a
suitable primer which will anneal to the mRNA in a conventional manner to provide the initiation site for the
RT described herein to synthesize the single-strand cDNA along the template.

For background and protocols on synthesis of cDNA and reverse transcription, see Molecular Cloning: A
Laboratory Manual ("Maniatis") pages 129-130 and 213-216 (incorporated herein by reference). If it is
desired to separate any RNase activity when such is present, the protocols referred to in Maniatis in the
Chapter on Synthesis of cDNA may be referred to (page 213). See also Marcus et al., J. Virol., 14, 853
(1974) and other references cited at page 213. Other protocols are known in the art, such as including in the
reverse transcription reaction mixture an inhibitor of RNase, such as vanadyl-ribonucleoside complexes or
RNasin.

For further protocols, see Molecular Cloning: A Laboratory Manual ("Sambrook"), Vol. 1, Units 5.34, 5.52-
5.55 for RTs (RNA-dependent polymerases); Units 7.79-7.83 for RNA primer extension; Vol. 2, Units 8.11-
8.13, 8.60-8.63, 14.20-14.21 for first strand cDNA synthesis and 10.13 for synthesis of DNA probes with
ssRNA template and 7.81 (and B.26) for suitable buffers (incorporated herein by reference).

RNA-directed DNA polymerases can be purified by methods known in the art. See Houts, G.E., Miyagi,
M., Ellis, C., Brand, D., and Beard, J.W. (1979), J. Virol. 29, 517. Also see Current Protocols in Molecular
Biology, Vol. 1 ("Protocols"), Units 3.7.1 - 3.7.2 for description of RT isolation and purification (incorporated
herein by reference); Roth et al., J. Biol. Chem., 260, 9326-9335 (1985); and Verma, I.M., The Enzymes, Vol.
14A (P.D. Boyer, ed.), 87-104, Academic Press, NY (1977); modified to the extent necessary for the RTs
described herein. Also see BRL Catalogue, page 17 (1985). Isolation and purification of the RTs described herein
was according to the method described by Lampson et al., Science, 243, 1033-1038 (1989b). See also
Lampson et al., J. Biol. Chem., 265, 8490-8496 (1990).

There is provided hereinafter additional description of the various features identified above.

All RTs of the retrons show significant similarities between their amino acid sequences and the sequences
found in retroviral RTs. They also contain the highly conserved polymerase consensus sequence YXDD
found in all known RTs.
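
By way of illustration, the YXDD consensus mentioned above (tyrosine, any residue, aspartate, aspartate) can be located in an amino acid sequence with a simple pattern search. The sketch below is a minimal example in Python; the function name and the toy sequence are hypothetical and are not taken from any actual RT.

```python
import re

# Y, then any one-letter amino acid code, then D, D. The lookahead lets
# overlapping occurrences be reported as well.
YXDD = re.compile(r"(?=(Y[A-Z]DD))")

def find_yxdd(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched motif) for every YXDD occurrence."""
    return [(m.start() + 1, m.group(1)) for m in YXDD.finditer(protein.upper())]

# Hypothetical toy sequence used only to exercise the function.
toy_protein = "MKLVYADDILIGSQYMDDAA"
print(find_yxdd(toy_protein))
```

A search of this kind only flags candidate motifs; it says nothing about whether the surrounding sequence forms a functional polymerase domain.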

While the ORF is a common feature of the retron, the length of the ORF region can vary among the
individual retrons; for instance, from a minimum of 948 to a maximum of 1,758 nucleotides. The ORF region
of msDNA-Ec73 is 948 nucleotides in length; in Ec107, it is 957; in Ec86, it is 960; in Mx65, it is 1,281; in
Mx162, it is 1,455 nucleotides in length and the ORF region of Ec67 is 586 nucleotides in length. The
minimum number of nucleotides necessary is that which is effective to encode a polypeptide that has RT
activity sufficient to contribute to the synthesis of the msDNA using the transcript as described herein.

Due to the extensive size differences of the RT ORFs, the domain structures are quite diverse. Only
one RT, that from msDNA-Ec67 (RT-Ec67), contains an RNase H domain. All RTs from myxobacteria
contain an extra amino terminal domain of 139 to 170 residues, while RTs from E. coli (except for
RT-Ec67) have only an RT domain and consist of 316 to 320 residues. The amino acid sequence of RT-Sa163
consists of 480 residues which show 78% identity with the sequence of RT-Mx162.
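
The relation between ORF lengths in nucleotides and polypeptide lengths in residues is simple triplet arithmetic. The sketch below, offered only as an illustration, divides nucleotide lengths stated in this description by three; the dictionary and function names are hypothetical, and the sketch does not address whether a stop codon is counted in the stated lengths.

```python
# ORF lengths in nucleotides, as stated in this description.
ORF_NT = {"Ec73": 948, "Ec107": 957, "Ec86": 960, "Mx65": 1281, "Mx162": 1455}

def codon_count(orf_nt: int) -> int:
    """Number of complete triplets in an ORF of the given nucleotide length."""
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of three")
    return orf_nt // 3

for name, nt in ORF_NT.items():
    print(f"{name}: {nt} nt = {codon_count(nt)} codons")

# The stated maximum ORF length of 1,758 nucleotides, for comparison.
print(f"maximum: 1758 nt = {codon_count(1758)} codons")
```

On this arithmetic the three E. coli ORFs listed give 316, 319 and 320 codons, in line with the 316 to 320 residues stated for the E. coli RTs, and the stated maximum of 1,758 nucleotides corresponds to 586 codons.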

It is noteworthy that certain of the RTs described herein can use heterologous msDNAs as template
primers to extend the synthesis of msDNA in vitro. For instance, purified Ec67-RT can use heterologous
msDNAs, Ec86 and Mx162, for such purpose.

As has been noted above, the msr-msd region and the RT gene can be expressed under independent
promoters to produce msDNAs. However, the msr-msd region for the production of msDNA-Ec67 can only
be complemented by the RT-Ec67, but not by the RT-Ec73 gene or vice versa. This specificity may be due
to the priming reaction for each msDNA. With respect to promoter(s), msr-msd and the RT may be driven
by a single promoter upstream of the msr-msd region or the RT gene may be driven by a separate
promoter, for instance the yeast lpp-lac promoter. The promoter may be an endogenous or a foreign
promoter like the GAL10 promoter.

There will be described hereinafter various methods of synthesizing the msDNAs. A number of methods
of making the msDNAs of the invention are described in our earlier patent applications. For example, the
msDNAs of the invention have been synthesized in vivo in suitable prokaryotes or eukaryotes. The msDNAs
can also be synthesized by chemical methods, in permeabilized cells or in cell-free systems.

A cell-free method of making msDNAs in vitro comprises permeabilizing the membrane of a suitable
prokaryotic cell by treating with a suitable cell permeabilizing agent, incubating said permeabilized cells in a
reaction mixture with suitable substrates for msDNA synthesis and isolating and purifying the synthesized
msDNA of the invention. Detailed description is found in pending application Serial No. 07/315,427, referred
to above.

The msDNAs of the invention may be synthesized in vivo in suitable host cells. The msDNAs may be
synthesized in prokaryotic cells. Likewise, eukaryotic cells may also be utilized to synthesize the msDNAs
of the invention. See U.S. Serial No. 07/753,110. The method of making the msDNAs of the invention in vivo
comprises culturing a suitable cell containing a rDNA construct encoding a prokaryotic msDNA synthesizing
system, which includes the ORF region. See U.S. Serial No. 07/517,946. The synthesizing systems include
retrons which encode the msDNAs of the invention. See U.S. Serial No. 07/518,749. The cells were
transformed with suitable plasmids constructed with the retrons. msDNAs typical of the invention are as shown
in FIGS. 1, 2 and 3. After synthesis they are isolated and purified.

Suitable prokaryotic cells for the in vivo expression of the msDNAs of the invention are M. xanthus and
E. coli. Suitable eukaryotic cells are yeasts, e.g., Saccharomyces cerevisiae. Any suitable prokaryotic or
eukaryotic cells capable of expressing the msDNAs of the invention can be used.

There will be described hereinafter several interesting utilities of the RTs and msDNAs. The RTs
described herein (and in the earlier patent applications referred to above) are believed to be valuable in assays
for screening different molecules for their ability to inhibit or block the activity of the RTs both in vivo and in
vitro.

Since the msDNA production in vivo can be monitored either biochemically or genetically, it can be
used for screening drugs which block the msDNA synthesis. If such drugs are found, they may block the
msDNA synthesis at the unique 2'-OH priming reaction (the formation of a 2',5'-phosphodiester linkage) or
the extension of msDNA synthesis (or cDNA synthesis).

An in vitro assay is thus provided which comprises a suitable template, RNA or DNA (or a molecule
which contains such a template), an RT of the invention and a molecule to be screened (and other
conventional components) to allow the RT activity to manifest itself. The greater the inhibitory or blocking
effect of the screened molecule(s) on the RT activity, the more likely the molecule is to be a useful
candidate for biological and medical applications where it is sought to inhibit a disease due to a retrovirus
such as HIV, HTLV-I and others.

Since the synthesis of msDNAs is RT-dependent, the molecules to be screened for effect on RT
activity can be tested in the in vitro synthesis of the msDNAs. The extent to which that synthesis is inhibited
determines the effectiveness of the molecule(s).
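
The extent of inhibition in such a screen can be expressed relative to an untreated control. The following Python sketch shows one conventional way of doing so, as an assumption and not as a method taken from this description; the function names, compound labels and signal values are hypothetical.

```python
def percent_inhibition(signal_with_compound: float, signal_control: float) -> float:
    """Inhibition of msDNA synthesis relative to a no-compound control, in percent."""
    if signal_control <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - signal_with_compound / signal_control)

def rank_candidates(readouts: dict[str, float], control: float) -> list[tuple[str, float]]:
    """Sort screened molecules from the strongest to the weakest inhibitor."""
    scored = [(name, percent_inhibition(value, control)) for name, value in readouts.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical readouts (e.g., labeled nucleotide incorporated, arbitrary units).
example_readouts = {"compound-1": 90.0, "compound-2": 10.0, "compound-3": 50.0}
for name, inhibition in rank_candidates(example_readouts, control=100.0):
    print(f"{name}: {inhibition:.1f}% inhibition")
```

Any quantitative readout of msDNA synthesis could be supplied as the signal, provided the control and test reactions are measured in the same way.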

Other suitable molecules that can be used as a template are single-stranded DNA cloning vehicles
such as the M13 cloning vectors (mp 8). See Molecular Cloning: A Laboratory Manual ("Sambrook"), Vol. 1,
Units 4.33-4.38 for cloning foreign DNA into bacteriophage M13 vectors and Unit 1.20, Bluescript M13+ and
M13- (incorporated herein by reference).

Since the msDNA is produced in several hundred copies per retron, the msDNA can be used for gene
amplification. This can be performed by classic protocols by splicing into the stem portion of a stem-and-
loop region of the msDNA a double-stranded DNA containing a gene, or by replacing a portion of the stem-
and-loop region by a double-stranded DNA containing a desired gene. For instance, in the synthetic
msDNAs shown in FIG. 3a and 3b, the stem-and-loop region of the msDNA can be cut out by an
appropriate restriction enzyme from the retron-containing plasmid DNA having restriction sites XhoI and
SacII. A DNA fragment is then ligated into the site which contains two copies of a gene of interest either in
head to head or tail to tail orientation. When this region is copied as a single-stranded DNA into the msDNA,
a stem-and-loop structure is formed because of the palindromic orientation of the two copies of the genes.
Thus, the gene of interest is reconstructed in the stem structure. By this method of gene amplification, a
large number of genes can be produced. When the msDNA structure is not foreign to E. coli, this
microorganism is particularly well suited as a vehicle for gene multiplication.
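
The folding of two oppositely oriented gene copies into a stem can be illustrated with a short Python sketch. It is an idealized model that considers Watson-Crick pairing only; the function names, the toy gene and the spacer are hypothetical and are not sequences of the invention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def opposed_copies_insert(gene: str, spacer: str = "") -> str:
    """One strand carrying a gene copy, an optional spacer, and the inverted copy."""
    return gene.upper() + spacer.upper() + reverse_complement(gene)

def stem_length(ssdna: str) -> int:
    """Count the consecutive bases from each end that can pair with one another."""
    s = ssdna.upper()
    n = 0
    while n < len(s) // 2 and s[n].translate(COMPLEMENT) == s[-(n + 1)]:
        n += 1
    return n

# Hypothetical 9-base "gene" and 4-base spacer.
insert = opposed_copies_insert("ATGGCTAAC", spacer="TTTT")
print(insert, stem_length(insert))
```

In this model the stem is as long as the inserted gene copy, which corresponds to the statement above that the gene of interest is reconstructed in the stem structure.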

In another application, the msDNAs of the invention are used to produce stable RNAs. 

The msDNAs of the invention are useful in the production of polypeptides and in the production of
antisense molecules. Polypeptides will be produced from DNA fragments inserted in the retron such that
the sense strand is transcribed.

The msDNAs of the invention are useful as a vector for the production of recombinant antisense single-
stranded DNAs. Antisense molecules will be produced from DNA fragments inserted in the retron in an
orientation such that the primary transcript contains an antisense strand. Such msDNAs are especially
useful because of their stability in contrast to antisense molecules known to date which are comparatively
unstable. A selected DNA fragment can be inserted into the msr or the msd region of the retron. The primary
transcript in turn will contain an RNA sequence complementary to the inserted DNA fragment in addition to
the msr, msd regions and the ORF for the RT. The DNA fragment containing the gene of interest can be
inserted in either orientation such that the primary transcript will contain either the sense strand and function
as an mRNA, i.e., produce a polypeptide, or the antisense molecule strand, in which case the primary
transcript will anneal to the target gene mRNA and inhibit its expression. The msDNA produced therefrom
operates as an antisense against the mRNA produced in vivo from the target gene and thus can be used to
regulate the expression of the gene in vitro. The DNA fragment can be inserted into either the msr or the
msd region of the retron. The target gene will then be expressed in the RNA or DNA portion of the msDNA,
respectively. The insertion into the msr or msd region can be performed in any suitable restriction site as is
known in this field of technology.

The expression of an antisense molecule as an msDNA is highly advantageous. Antisense molecules
produced to date are known to be inherently unstable and rapidly degraded. In contrast, the msDNAs
carrying the antisense fragment exhibit remarkable stability. Such msDNAs contain in either the RNA or the
DNA portion an antisense strand that is complementary to and capable of binding or hybridizing to and
inhibiting the translation of the mRNA of the genetic material or target gene. Upon binding or hybridizing
with the mRNA, the translation of the mRNA is prevented with the result that the product such as the target
protein coded by the mRNA is not produced. Thus, the msDNAs provide useful systems for regulating the
expression of any gene and contribute to overcoming the problem of lack of stability associated with
antisense molecules of the prior art.
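
The complementarity on which the antisense approach relies can be illustrated as follows. This minimal Python sketch derives the single-stranded DNA that is antisense to a given mRNA segment by Watson-Crick rules; the function names and the six-base example are hypothetical.

```python
# DNA base that pairs with each RNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def antisense_dna_for(mrna_segment: str) -> str:
    """Return, written 5' to 3', the DNA strand complementary to an mRNA segment."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna_segment.upper()))

def is_antisense(dna: str, mrna_segment: str) -> bool:
    """True if the DNA is fully complementary (antiparallel) to the mRNA segment."""
    return dna.upper() == antisense_dna_for(mrna_segment)

# Hypothetical target segment.
target = "AUGGCU"
print(antisense_dna_for(target))
```

Such a check concerns sequence complementarity only; it does not model stability, secondary structure or the efficiency of hybridization in a cell.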

For an illustration of an msDNA carrying an antisense fragment, reference is made to pending patent
application Serial No. 07/753,110. Any of the msDNAs can be used for that purpose.

When it is desired to insert a DNA sequence in an msDNA for encoding a protein (polypeptide), e.g.,
two copies of a gene will be inserted in tandem and in opposite orientation with respect to one another at a
selected restriction site into the msd sequence of an msDNA of choice, such as YEp521-M4.

The msDNAs may be useful in HIV therapy as follows. Healthy lymphocytes are taken from a patient
and stored. When needed by the patient, the msDNAs would be proliferated; a DNA construct is inserted
into the msDNA which would produce an antisense against one of the HIV essential proteins, and these
lymphocytes are then transfused back into the patient. Thus, a growing population of lymphocytes develops
which will be resistant to HIV.

For literature and other references relating to antisense RNA and its application in gene regulation, see
for instance: Hirashima et al., Proc. Natl. Acad. Sci. USA, 83, 7726-7730 (October 1986) and Inouye, Gene,
72, 25-34 (1988); and European Patent Application A2 0 140 308, published May 8, 1985, entitled
"Regulation of gene expression by employing translational inhibition utilizing mRNA interfering
complementary RNA", based on U.S. Patent applications Serial No. 543,528 filed October 20, 1983 and Serial No.
585,282 filed March 1, 1984, which are incorporated herein by reference.

For an up-to-date report on antisense, see Antisense Research and Development, 1, 207-217 (1991),
Hawkins and Krieg, Editors; Mary Ann Liebert, Inc., Publishers. See "Meeting Report: Gene Regulation by
Antisense RNA and DNA"; for a listing of patents in that field, see "A Listing of Antisense Patents, 1971-
1991" therein. The msDNAs of the invention are useful in numerous applications described therein.

A fascinating utility considered is the role of the msDNAs of the invention in the formation of triple-helix
DNA, or triplex DNA with a specific duplex on the chromosome. A recent report in Science, 252, 1374-1375
(June 27, 1991), "Triplex DNA Finally Comes of Age", highlights the timeliness of the present invention.
Triplex DNA can be formed by binding a third strand to specific recognized sites on chromosomal DNA.
Synthetic strands of sizes preferably containing the full complement of bases (such as 11-15 and higher)
are discussed. The msDNAs of the invention appear to be excellent candidates for such applications. The
msDNAs provide single-stranded DNAs necessary for triplex formation. The resulting triplex DNA is
expected to have increased stability and usefulness. New therapies based on the triple-helix formation,
including AIDS therapy, selective gene inhibition and others, are proposed in the Report.
Other applications can be envisioned by one skilled in the art.

A third embodiment of the invention relates to a cell-free synthesis of a typical msDNA. The method
comprises reacting a total RNA preparation containing the msr-msd region and a purified RT (Ec67-RT)
under conditions suitable for the reaction.

Using this cell-free system, the priming reaction, during initiation of DNA synthesis, was demonstrated
to be a specific template-directed event. Only dTTP was incorporated into a 132-base precursor RNA,
yielding a 133-base compound. This specific dT addition could be altered to dA or dC by simply
substituting the 118th A residue of the putative msr-msd transcript with a T or G residue. The priming
reaction was blocked when A was substituted for G at the 15th residue of the precursor RNA transcript,
which corresponds to the branched rG residue in msDNA. DNA chain elongation could be terminated by
adding ddNTP in the cell-free system, forming a sequence ladder. The DNA sequence determined from this
ladder completely agreed with the msDNA sequence. A part of the fully extended cell-free product
contained a 13-base RNA strand resistant to RNase A, which was consistent with the previously proposed
model. In this model the 5'-end sequence of the msr-msd transcript (bases 1 to 13) forms a duplex with the
3'-end sequence of the same transcript, thus serving as a primer as well as a template for msDNA
synthesis by RT.

As described hereinabove, the msDNA synthesis is primed from the 2'-OH group of the rG residue
using the bottom RNA strand as a template. The first base or the 5'-end of msDNA is determined by the
base at position 118 in Fig. 1C. Thus, the synthesis of msDNA-Ec67 starts from a dT residue, complementary
to the rA residue at position 118. See Fig. 1C.

In other msDNAs, the internal G residue occurs at different locations as described above. Thus, in the 
synthesis of other msDNAs the synthesis starts at the base in the DNA which is complementary to the first 
base in the RNA strand. 
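
The template-directed choice of the first deoxyribonucleotide can be summarized in a few lines of Python. The sketch below is illustrative only: the function and variable names are hypothetical, the variant table restates the wild-type, mut-1 and mut-2 cases discussed in this description, and the entry for a C template base is inferred from base-pairing rules rather than from an experiment reported here.

```python
# dNTP expected to be incorporated opposite each RNA template base.
TEMPLATE_TO_FIRST_DNTP = {
    "A": "dTTP",
    "U": "dATP",
    "G": "dCTP",
    "C": "dGTP",  # by base-pairing rules only; not tested in this description
}

def first_dntp(template_base: str) -> str:
    """Return the dNTP complementary to the RNA base at the priming template position."""
    return TEMPLATE_TO_FIRST_DNTP[template_base.upper()]

# RNA base at position 118 of the precursor for each preparation discussed.
POSITION_118 = {"wild-type": "A", "mut-1": "U", "mut-2": "G"}

for variant, base in POSITION_118.items():
    print(f"{variant}: template r{base} -> first base from {first_dntp(base)}")
```

For msDNAs other than msDNA-Ec67, the same lookup would apply to whichever template position is equivalent to position 118.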

There is established first a cell-free system for the synthesis of msDNA-Ec67 using partially purified
RT-Ec67 and the RNA fraction prepared from cells harboring p67-BH0.6. This plasmid contained the msr-
msd region and a truncated RT gene from retron-Ec67 as described in the Examples. As shown in Fig. 2,
[α-32P]dTTP (lane 3) was specifically incorporated into a product migrating at the position of 133
nucleotides in size. Neither [α-32P]dGTP (lane 1) nor [α-32P]dATP (lane 2) was incorporated in the cell-free
system. In the case of [α-32P]dCTP (lane 4), two minor bands appeared at positions shorter by 4 to 5
bases than the major product labeled with [α-32P]dTTP (lane 3). As discussed later, these products were
labeled even in the absence of the branched rG residue (lane 16, Fig. 2), indicating that these were not
associated with msDNA synthesis. When RT was omitted from the reaction mixture, no labeled bands were
detected with [α-32P]dTTP.

The size of the product labeled with dTTP agrees well with that of the structure proposed in Fig. 1C; the
folded RNA precursor consists of 132 bases and the addition of a dT residue to the RNA molecule yields
an oligonucleotide consisting of 133 bases (see also structure III in Fig. 3).

Two mutations were then constructed; in the first mutation the rA residue at position 118 of the
precursor RNA molecule (Fig. 1C) was substituted with a U residue (mut-1) and in the second mutation
with an rG (mut-2). When the mut-1 RNA preparation was used for the priming reaction, [α-32P]dATP was
specifically incorporated into the major product (lane 6, Fig. 2) migrating at the same position as the
product labeled with [α-32P]dTTP using the wild-type RNA fraction (lane 3). Similarly, [α-32P]dCTP was
specifically incorporated with the mut-2 RNA preparation (lane 12). It should be noted that the A to G
substitution in mut-2 resulted in three consecutive rG residues on the template strand of the RNA molecule
(from base 116 to base 118; see Fig. 1C). Therefore, one to three dC residues are expected to be added to
the precursor molecule. Indeed, the band labeled with [α-32P]dCTP in lane 12, Fig. 2 was much broader
towards higher molecular weights than the wild-type product (lane 3) and the mut-1 product (lane 6). Thus,
it appears that the nature of the residue at position 118 (in the case of that msDNA) is not critical.

Previously it was demonstrated that the branched rG residue is essential since the substitution of the G
residue with an A residue completely blocked the synthesis of msDNA-Mx162 in vivo. Similarly, in the
present cell-free system, the G to A substitution (mut-3 at position 15 in Fig. 1C) completely abolished the
specific [α-32P]dTTP incorporation into the precursor RNA (compare lane 15 with lane 3 in Fig. 2). Doublet
bands are still produced with [α-32P]dCTP even with the mut-3 RNA preparation (lane 15), indicating that
these bands are not associated with msDNA synthesis.

In all known msDNAs from both myxobacteria and E. coli, the base directly opposite to the branched G
residue in the folded RNA precursor is always an rG residue without exception (residue 119 in Fig. 1C).
When this rG residue at position 119 was changed to A (mut-4), the specific dT incorporation was still
observed but the incorporation was substantially reduced (to approximately 5% of the wild-type
incorporation). This indicates that this rG residue on the template strand plays an important role in the priming
reaction. When the products from the priming reaction in Fig. 2 were digested with RNase A, all yielded
products of small molecular weights migrating almost at the front of gel electrophoresis.

Thus, the studies described above clearly demonstrate that the first base was added to the precursor
RNA molecule in a specific manner such that the first base is complementary to the base positioned at the
A residue in structure II in Fig. 3. Furthermore, the addition of the first base is absolutely dependent upon
the RT preparation added in the reaction and also upon the rG residue (circled in Fig. 3) at the end of the
a1-a2 stem. The T residue (complementary to the rA residue at position 118) linked to the branched rG
residue then serves as a primer to further extend the DNA chain along the RNA template. As the DNA
strand is extended, the RNA template is concomitantly removed as shown in structure III so that the total
number of bases of structure III is almost identical to that of structure II.

In other msDNAs, likewise, the base residue complementary to the base residue at a position equivalent to
118 is the base from which the DNA chain extends along the RNA template.

In order to confirm this model, the chain elongation reaction was carried out using the same cell-free
system as used for the first base addition in Fig. 2; in addition to [α-32P]dTTP, three other dNTPs as well as
dideoxynucleotides (ddNTPs) were added for separate chain-termination reactions (Sanger et al., 1977).
After the chain-elongation reaction, the products were treated with RNase A to remove single-stranded RNA
attached to them. As can be seen in Fig. 4, a ladder is formed, clearly indicating that the DNA chain was
elongated using a specific template sequence. The sequence determined from the ladder is identical with
the DNA sequence from base 24 to base 54 of msDNA-Ec67 (Fig. 1B). Although some termination of
msDNA synthesis occurred at around positions 42 to 44, most of the reaction terminated at around position
69, forming a strong band in all lanes at position (a). This product is most likely the fully extended msDNA-
Ec67 (67-base single-stranded DNA) that is linked to a 4-base RNA, AGAU, resulting from RNase treatment
(structure IVa in Fig. 3). The DNA strand is considered to be branched out from the 2'-OH group of the G
residue of the tetranucleotide. Every band in the sequencing ladder migrated at a position longer by 2
bases than what was expected from the size of the DNA strand. This was probably caused by the extra
4-base RNA attached at the 5'-end of the DNA strand. The 2-base discrepancy in the mobility in the gel is
likely to be due to the branched RNA structure.
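
The principle by which a sequence is read from a chain-termination ladder can be sketched in Python as follows. This is an idealized model, not the procedure used in the experiments above: it assumes one termination product at every occurrence of the base matching the ddNTP and ignores the 2-base mobility offset noted above. The function names and the toy sequence are hypothetical.

```python
def termination_lengths(dna: str, dd_base: str) -> list[int]:
    """Lengths of the products terminated by the ddNTP matching dd_base."""
    return [i + 1 for i, base in enumerate(dna.upper()) if base == dd_base.upper()]

def read_ladder(lanes: dict[str, list[int]]) -> str:
    """Rebuild the sequence by reading band lengths from shortest to longest."""
    base_at_length = {length: base for base, lengths in lanes.items() for length in lengths}
    return "".join(base_at_length[n] for n in sorted(base_at_length))

# Hypothetical 7-base DNA used only to exercise the functions.
toy_dna = "TCAGGCT"
toy_lanes = {base: termination_lengths(toy_dna, base) for base in "ACGT"}
print(toy_lanes)
print(read_ladder(toy_lanes))
```

Reading the four lanes together recovers the toy sequence, in the same way that the ladder in Fig. 4 is compared with the known msDNA-Ec67 sequence.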

RNA Structure at the 5'-End - The structure of msDNA-Ec67 produced in vivo has been determined as
shown in Fig. 1B (Lampson et al., 1989b), which corresponds to structure IVa in Fig. 3. On the basis of the
proposed model shown in Fig. 3, structure IVb may also be produced, in which the 5'-end arm of the
msdRNA (upstream of the branched rG residue and the sequence from base 1 to 14 in Fig. 1B) forms a
double-stranded RNA (14-base pair) which represents the remaining a1-a2 stem structure from the folded
precursor RNA template. In Fig. 4, band (b) migrated at around 82 bases, which is longer by 13 bases than
band (a). Since the double-stranded RNA is resistant to RNase A and heating prior to gel electrophoresis
dissociated the 14-base RNA from the msDNA, the entire 5'-end arm remained with the DNA strand (see Fig. 3).
Thus, band (a) and (b) products consist of 71 and 84 bases, respectively, which migrated at the 69- and
82-base positions, respectively, in Fig. 4.
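
The band sizes quoted above can be reproduced by simple addition. The Python sketch below restates that arithmetic using only figures given in this description; the constant and function names are hypothetical.

```python
DNA_STRAND = 67      # fully extended msDNA-Ec67 single-stranded DNA, in bases
RNA_TETRAMER = 4     # AGAU remaining after RNase A treatment
PROTECTED_RNA = 13   # additional RNase A-resistant RNA retained in band (b)

# Gel positions at which the bands were observed to migrate in Fig. 4.
OBSERVED_POSITION = {"a": 69, "b": 82}

def band_sizes() -> dict[str, int]:
    """Total base counts of the band (a) and band (b) products."""
    band_a = DNA_STRAND + RNA_TETRAMER
    band_b = band_a + PROTECTED_RNA
    return {"a": band_a, "b": band_b}

def mobility_offsets() -> dict[str, int]:
    """Difference between the total base count and the observed gel position."""
    sizes = band_sizes()
    return {band: sizes[band] - OBSERVED_POSITION[band] for band in sizes}

print(band_sizes())
print(mobility_offsets())
```

The offset between total base count and observed position is the same for both bands, so the 13-base spacing between bands (a) and (b) is preserved on the gel.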

To unambiguously prove the existence of structure IVa, the band (b) product was extracted from the
gel and retreated with RNase A. As shown in Fig. 5, the purified band (b) product (lane 3) changed its
mobility to the band (a) position in a sequencing gel when it was treated a second time with RNase A (lane
5). No change in the mobility was observed before and after RNase treatment of band (a) (lanes 4 and 6,
respectively).

Interestingly, the size difference between band (d) and (c) in Fig. 4 is also approximately 13 bases; the
size difference between band (d) and (b) or between band (c) and (a) is approximately 35 bases. On the
basis of these sizes, the band (c) product is likely a result of further extension of the single-stranded DNA
all the way to the branched G residue using the msdRNA as a template (see Fig. 3). This extension
elongates the msDNA by another 35 bases at its 3'-end, which agrees well with the size of band (c). Such
DNA elongation from the 3'-end of msDNA has been demonstrated for msDNA-Ec67 with a partially purified
RT-Ec67 (Lampson et al., 1990). Thus the band (d) product is considered to consist of the fully extended
msDNA strand (102 bases) plus the 17-base RNA similar to the RNA structure of the band (b) product
(structure VIb in Fig. 3).

The above studies show the complementation in a cell-free system using the RNA fraction from cells
harboring p67-BH0.6 and RT partially purified from cells harboring pRT-67. The cell-free synthesis of 
msDNA-Ec67 was initiated de novo by the bacterial RT and from the expected first base. The following 
features of the cell-free system of synthesis of the msDNAs described are particularly noteworthy: (1) The
incorporation of the first dNTP for the primary reaction for msDNA-Ec67 as well as further extension of the
DNA chain is absolutely dependent upon the addition of RT and the RNA fraction containing the transcript
from the msr-msd region. If either of them was omitted from the reaction mixture, the specific incorporation
of the first base (dTTP for the wild-type msDNA-Ec67) into the precursor molecule was not observed. (2)
The first base linked to the precursor RNA molecule is determined by the 118th base of the primary RNA
transcript from the msr-msd region serving as a template (see Fig. 1C). For other msDNAs it is the base
corresponding to that in the 118th position in this msDNA species. The first base is always complementary
to the base at the 118th position of the precursor RNA molecule. (3) The 15th residue of the primary
transcript is a G residue and is essential for the priming reaction. This G residue corresponds to the
branched G residue of msDNA-Ec67 (see Figs. 1B and 1C). In other msDNAs the G may be positioned at
other positions as described. (4) The compound to which the first dNTP, determined by and complementary
to the 118th base in the primary transcript, is linked, is sensitive to RNase A and detected as a single band
in acrylamide gels. From its mobility the compound appears to consist of 133 bases. (5) When all four
dNTPs are added in the reaction mixture, the DNA chain is elongated and the major product from this
reaction is estimated to consist of approximately 69 bases. (6) When ddNTPs are added in the elongation
reaction in addition to the four dNTPs, a sequencing ladder is formed, and the sequence read from the ladder
completely matches the DNA sequence of msDNA-Ec67. (7) The RNA molecule attached to the 5'-end
of the extended DNA molecule is protected from RNase A digestion. This protection from RNase A is due
to the formation of a double-stranded structure which represents the remaining a1-a2 stem structure from
the folded precursor RNA molecule, and thus the RNA molecule can be digested if the cell-free product is
incubated in a boiling water bath prior to RNase A treatment. (8) The size of RNA removed by the RNase A
treatment after boiling is 13 bases.
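
Feature (2) above can be illustrated with a minimal sketch. It assumes only standard Watson-Crick pairing between the template base at position 118 and the incoming dNTP; the function name `first_dntp` and the dictionary are hypothetical and introduced for illustration. The wild-type base (A) and the two substitutions at position 118 are those listed for mutations 1 and 2 in Example 8.

```python
# Watson-Crick partner dNTP for each RNA template base (standard pairing rules).
DNTP_FOR_TEMPLATE_BASE = {"A": "dTTP", "U": "dATP", "G": "dCTP", "C": "dGTP"}


def first_dntp(template_base):
    """Return the dNTP complementary to the template base at position 118.

    The base may be given in DNA notation (T) or RNA notation (U);
    a T in the gene is read as U in the primary transcript.
    """
    base = template_base.upper().replace("T", "U")
    if base not in DNTP_FOR_TEMPLATE_BASE:
        raise ValueError(f"not a nucleotide: {template_base!r}")
    return DNTP_FOR_TEMPLATE_BASE[base]


# Base at position 118, written as in the gene sequence (hypothetical labels).
position_118 = {
    "wild type": "A",
    "mutation 1 (A to T)": "T",
    "mutation 2 (A to G)": "G",
}

for label, base in position_118.items():
    print(f"{label}: template base {base} -> first dNTP {first_dntp(base)}")
```

Under this pairing assumption the wild-type template calls for dTTP, in agreement with feature (1), while the two substitutions would call for dATP and dCTP respectively.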

The following Examples are given for purpose of illustration and not in any way by way of limitation on
the scope of the invention.

EXAMPLE 1



The method of in vitro synthesis of msDNA in M. xanthus is described in detail in allowed U.S. Serial 
No. 07/315,427 and incorporated herein by reference. 

EXAMPLE 2

The method of in vivo synthesis of msDNA-Ec67 in yeast is described in detail in pending patent 
application Serial No. 07/753,110 and is incorporated herein by reference. 

EXAMPLE 3

The method of in vivo synthesis of msDNA-Mx65 is described in detail in Dhundale, Journal of
Biological Chemistry, 263, 9055-9058 (1988).

EXAMPLE 4

The method of in vivo synthesis of msDNA-Ec67 in E. coli is described in detail in U.S. Serial No. 
07/315,432, which is incorporated herein by reference. 

EXAMPLE 5



Two separate synthetic msDNA molecules were constructed. A 196-bp synthetic msDNA containing an 
entire msr-msd region was synthesized from four double-stranded oligonucleotide units. The synthetic 
genes and their components are shown in FIG. 9b. Eight single-stranded oligonucleotides, forty-six to
fifty-six bases in length, were synthesized. The appropriate pairs of oligonucleotides were annealed by
heating at 100 °C for 5 minutes, then cooling at 30 °C for 30 minutes and at 4 °C for 30 minutes. An E. coli
pINIII(lppP-5) expression vector retron was digested with XbaI-EcoRI, and an XbaI-EcoRI fragment from the
clinical E. coli strain C1-1 was inserted such that the RT gene was under lpp-lac promoter control and used
to transform E. coli. After identification of the clone, the 10.7-kb pINIII(lppP-5) Ec67-RT plasmid DNA was
isolated. The 196-bp synthetic msDNA fragment was then inserted into the vector by digesting with XbaI,
treating the vector ends with bacterial alkaline phosphatase and ligating the fragment into the site. The
construction scheme is shown in FIG. 9. E. coli CL-83 was transformed with the pINIII(lppP-5) ms100-RT
plasmid and msDNA was synthesized. This artificial msDNA was designated ms100 and is illustrated in FIG.
8a.

EXAMPLE 6

A second synthetic msDNA, ms101, was expressed from the vector pUCK19, a derivative of pUC19.
pUC19 DNA was digested with DraI and the 2-kb fragment was isolated. The isolated fragment was ligated
to a 1.3-kb HinfI fragment from Tn5 encoding the kanamycin resistance gene. The resultant 3.3-kb plasmid,
pUCK19, was digested with XbaI and the 196-bp synthetic msDNA described above in Example 5 was
inserted. The pUCKms100 construct was digested with XhoI and SacII, which results in the excision of a
61-bp fragment from within the ms100 region. A synthetic 45-mer double-stranded oligonucleotide (shown in
FIG. 10 as ms-Cl,2) was ligated into the vector yielding pUCKms101 in which the msr-msd region is under
lac control. The construction scheme is shown in FIG. 10. RT was provided by transforming E. coli
containing pUCKms100 or pUCKms101 with pINIII(lppP-5) Ec67-RT. msDNA production was detected in the
cells containing these constructs.

EXAMPLE 7 



The ability of purified Ec67-RT to synthesize DNA from various templates composed of random 
sequences was examined using three different template:primer systems. 

E. coli 5S rRNA was annealed to a synthetic 15-base oligo-DNA (15-mer) complementary to the 3' end
of E. coli 5S rRNA which served as a primer for the polymerase. The 5S rRNA template:primer was
prepared by mixing 30 pmoles of E. coli 5S rRNA (Boehringer Mannheim) with 120 pmoles of a synthetic
15-base oligo-DNA (5'-ATCCCTGGC^TTCC-3'). The mixture was dried, then resolubilized in 30 µl of a
formamide solution (80% formamide, 20 mM PIPES, pH 6.5, 0.4 M NaCl). The solution was then heated at
90 °C for 10 minutes, transferred to 37 °C for 2 to 3 hours, followed by room temperature for 30 minutes.
The annealed template:primer was then precipitated with ethanol and lyophilized.

The annealed template:primer was added to a reaction buffer (pH 7.8) containing dNTPs and [α-32P]dCTP.
An aliquot from the glycerol gradient fraction containing the purified Ec67-RT was added to the
reaction mixture and incubated at 37 °C for 15 minutes. The products were treated with RNase A before
analysis by gel electrophoresis. Complete extension of DNA synthesis from the 3' end of the primer, using
5S rRNA as a template, should give a DNA product of 120 nucleotides. FIG. 11, lane 1 shows the labeled
products formed by the Ec67-RT after electrophoresis on a 6% polyacrylamide sequencing gel. A
predominant band migrated at about 120 bases which was resistant to treatment with RNase A. A band of
similar size was also produced when avian myeloblastosis virus reverse transcriptase (AMV-RT) was
substituted for the bacterial enzyme in the reaction mixture (arrow, FIG. 11, lane 4). Although there are
several intermediate size products formed, the bacterial enzyme, like the retroviral polymerase, synthesized 
a full length cDNA of 120 bases using the 5S rRNA as a template with a 15-mer DNA as a primer. 

The Ec67-RT also polymerized DNA using DNA as a template. In this reaction a 50-base, synthetic 
DNA was annealed to a synthetic 20-mer DNA primer complementary to its 3' end. The synthetic 50-base 
oligo-DNA template (5'-CGGTAAAACCTCCCACCTGCGTGCTCACCTGCGTTGGCACACCGGTGAAA-3')
was annealed to a complementary, 20-base oligo-DNA primer (5'-TTTCACCGGTGTGCCAA-3') in a similar
manner. Total RNA prepared from 1.2 ml of an overnight culture of E. coli C2110/pC1-1EP5b was used for
a reaction in which msDNA served as a template:primer. RNA was prepared by the hot phenol method.

This oligo-DNA template:primer was allowed to react with the Ec67-RT and the resulting products
formed are shown in FIG. 11, lane 2. A small band appears at the bottom of lane 3, migrating at about 20
bases in size. This indicates that only one to three dNTPs have been added to the 20-base primer since the
first and third bases extending from the 3' end of the primer would be expected to incorporate the labeled
dCTP resulting in this small product. A larger, but weakly labeled band is also present at roughly 50 bases
in size (arrow, FIG. 11). This product was resistant to treatment with RNase A and was the size expected for
a complementary DNA extending the full length of the 50-base template. A heavily labeled band of similar
size is also produced when AMV-RT is substituted for the bacterial enzyme in the reaction (FIG. 11, lane 5).
The ability of the Ec67-RT to synthesize a full length cDNA from either the 5S rRNA template or the oligo- 
DNA template is dependent on a primer annealed to the template.

The lanes in FIG. 11 were as follows: Lane S, pBR322 digested with MspI and 32P-labeled with the
Klenow fragment; lane 1, cDNA products synthesized when Ec67-RT is added to the reaction mixture
containing E. coli 5S rRNA as template, annealed to a complementary synthetic 15-mer DNA as a primer;
lane 2, Ec67-RT plus a 50-base, synthetic DNA as a template annealed to a 20-mer DNA primer; lane 3,
Ec67-RT plus total RNA from E. coli C2110/pC1-1EP5b containing msDNA-Ec67 as a template:primer.
Lanes 4, 5, and 6 are the same reactions as those in lanes 1, 2, and 3, respectively, except that AMV-RT
was substituted for Ec67-RT in the reaction mixture. Reactions with AMV-RT were diluted 100-fold before
loading on the gel.

Likewise, the other RTs disclosed herein are capable of synthesizing cDNAs from either a DNA or an
RNA template.
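
The primer-extension reasoning used for the oligo-DNA template in this Example can be checked with a short sketch. It takes the 50-base template and the primer exactly as printed above (with the stray space in the template removed) and assumes plain Watson-Crick complementarity; the helper names are hypothetical. Note that the primer sequence as printed contains 17 bases although the text describes a 20-base primer; the sketch uses the printed sequence without attempting to resolve that difference.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences as printed in the text (5'->3').
TEMPLATE = "CGGTAAAACCTCCCACCTGCGTGCTCACCTGCGTTGGCACACCGGTGAAA"
PRIMER = "TTTCACCGGTGTGCCAA"


def anneals_at_3_end(template, primer):
    """True if the primer is complementary to the 3' end of the template."""
    return reverse_complement(template[-len(primer):]) == primer


def extension(template, primer, n=None):
    """Bases added to the primer 3' end when copying the template (first n, or all)."""
    single_stranded = template[: len(template) - len(primer)]
    added = reverse_complement(single_stranded)
    return added if n is None else added[:n]


print(len(TEMPLATE), len(PRIMER))            # 50 17
print(anneals_at_3_end(TEMPLATE, PRIMER))    # True
print(extension(TEMPLATE, PRIMER, 3))        # CGC
print(len(PRIMER) + len(extension(TEMPLATE, PRIMER)))  # 50
```

With the sequences as printed, the first and third bases added to the primer are C, which is consistent with the statement that a product extended by only one to three bases would carry the labeled dCTP, and full extension gives a product of 50 bases.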

EXAMPLE 8 

The msDNAs of the invention can be additionally synthesized in vitro in a cell-free system. msDNA-Ec67
was synthesized de novo when RT-Ec67 and a total RNA fraction containing the primary transcript
from the msr-msd region of retron-Ec67 were isolated, mixed and incubated in the presence of the 4 dNTPs
at a temperature suitable for the reaction (preferably physiological temperatures) in the presence of buffers.
To remove a 5' end of the RNA transcript, the reaction product is incubated with RNase A. The detailed
experimental protocol is hereinafter described.

Bacterial Strains and Culture Media - E. coli SB221 (Nakamura et al., 1982) and C2110 (his rha polA1)
were used. These E. coli cells harboring plasmids were grown in L-broth (Miller, 1972) in the presence of
ampicillin (50 µg/ml) or spectinomycin (50 µg/ml).

Plasmid Construction and Mutant Isolation - To express the msr-msd region from retron-Ec67, the
BssHII site at base numbers 181 to 186 (see FIG. 6 in Lampson et al., 1989b) was changed to a
BamHI site by inserting an 8-mer BamHI linker at the blunt-ended BssHII site. Subsequently, the 615-bp
BamHI-HindIII fragment (HindIII at base numbers 795 to 800 in FIG. 6 in Lampson et al., 1989b) was isolated.
This fragment consists of the msr-msd region with its own promoter and a 5' end portion of the RT gene
(encoding the N-terminal 126 residues out of the 586-residue RT-Ec67), which was then cloned into the
BamHI-HindIII sites of pSP65 (Boehringer Mannheim). The resulting plasmid was designated p67-BH0.6. In
order to purify RT-Ec67, the RT gene was cloned under the lpp-lac promoter. For this purpose, an XbaI site
was first created 13 bases upstream of the RT initiation codon by oligonucleotide-directed site-specific
mutagenesis (Inouye and Inouye, 1991); TCTG (base 410 to 404 in FIG. 6 in Lampson et al., 1989b) was
changed to TCTAGA (see FIG. 1A in Lampson). Then, the resulting 3.3-kilobase (kb) XbaI-EcoRI fragment
was cloned into the XbaI-EcoRI sites of pGB2lppP-5, which was constructed by cloning the 1-kb PstI-BamHI
fragment from pINIIIlppP-5 (Inouye and Inouye, 1985) into the PstI-BamHI sites of pGB2 (Churchward et al.,
1984). The resulting plasmid was designated pRT-67. Various msd-msr mutations were isolated by
oligonucleotide-directed site-specific mutagenesis (Inouye and Inouye, 1991) using p67-BH0.6 (Fig. 1A).
Oligonucleotides used are: 5'-TGCGAAGGTGTGCCTGCA-3' for mutation 1 (A to T at position 118 in Fig. 1C),
TGCGAAGGGGTGCCTGCA for mutation 2 (A to G at position 118 in Fig. 1C), ATGTAGGCAAATTTGTTGG
for mutation 3 (branched G to A at position 15 in Fig. 1C), and TGCGAAGGAATGCCTGCAT for mutation 4
(G to A at position 119 in Fig. 1C).
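
As a consistency check on the mutagenic oligonucleotides for mutations 1, 2 and 4, the following minimal sketch reverts each stated base change and compares the results. It assumes, as an illustration only, that the three oligonucleotides are written in the same sense as the numbering of Fig. 1C, so that adjacent oligonucleotide positions correspond to positions 118 and 119. The helper names and the inferred wild-type stretch are not stated in the text; they are derived here from the printed sequences.

```python
# Mutagenic oligonucleotides as printed in the text (5'->3').
MUT1 = "TGCGAAGGTGTGCCTGCA"   # mutation 1: A to T at position 118
MUT2 = "TGCGAAGGGGTGCCTGCA"   # mutation 2: A to G at position 118
MUT4 = "TGCGAAGGAATGCCTGCAT"  # mutation 4: G to A at position 119


def differing_positions(a, b):
    """0-based indices at which two sequences differ over their shared length."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def revert(oligo, index, wild_type_base):
    """Return the oligo with the base at `index` replaced by the wild-type base."""
    return oligo[:index] + wild_type_base + oligo[index + 1:]


# Mutations 1 and 2 change the same position, so they should differ at one index.
(idx_118,) = differing_positions(MUT1, MUT2)
idx_119 = idx_118 + 1  # assumption: the next oligo position corresponds to 119

wt_from_1 = revert(MUT1, idx_118, "A")
wt_from_2 = revert(MUT2, idx_118, "A")
wt_from_4 = revert(MUT4, idx_119, "G")

print(idx_118)                      # 8
print(wt_from_1)                    # TGCGAAGGAGTGCCTGCA
print(wt_from_1 == wt_from_2)       # True
print(wt_from_4.startswith(wt_from_1))  # True
```

Under these assumptions all three oligonucleotides revert to the same stretch, with A at the position taken to be 118 and G at the position taken to be 119, in agreement with the base changes stated for each mutation.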

Purification of RT-Ec67 - The RT (from Ec67) was purified by the method described by Lampson et al.,
Science, 243, 1033-1038 (1989b) (see also, Lampson et al., J. Biol. Chem., 265, 8490-8496 (1990)) from
C2110 harboring pRT-67 with some modifications. After DEAE-cellulose batch purification, the sample was
applied to a Mono Q column (5 mm x 50 mm). Elution was carried out with a linear gradient of NaCl from
250 mM to 1 M using a Pharmacia FPLC system. The RT activity was eluted between 320 mM and 350
mM NaCl and separated.

Isolation of the RNA Transcript from the msr-msd Region - Total RNA fraction was isolated from SB221
cells harboring p67-BH0.6 with the method described by Chomczynski and Sacchi (1987). This fraction
containing the transcript from the msr-msd region was used as the template for msDNA synthesis in the
cell-free system.

Cell-free System for msDNA Synthesis - To produce msDNA, a total RNA fraction from a 1-ml culture
was added to a 10-µl reaction mixture containing RT buffer (50 mM Tris-HCl (pH 8.3), 1 mM dithiothreitol,
40 mM KCl, 6 mM MgCl2), 2 µCi of [α-32P]dTTP, and 2.5 mM each of dATP, dGTP and dCTP.
The reaction was started by adding 2 µl of the Mono Q-purified RT fraction. The mixture was incubated at
37 °C for 30 minutes. The samples were analyzed by electrophoresis on a 6% acrylamide gel in 9 M urea
followed by autoradiography.

Dideoxy Sequence Analysis during DNA Extension - A total RNA fraction prepared from a 25-ml culture
was added to a 100-µl reaction mixture containing 100 µCi of [α-32P]dTTP and 20 µl of the Mono Q-purified
RT fraction in RT buffer. After incubating at 37 °C for 5 minutes, the reaction mixture was divided into five
tubes (20 µl each). Four tubes were used for individual chain termination reactions using 14 µl of the
termination mixture of DNA sequencing with Sequenase (United States Biochemical Corp.). After the
reaction mixtures were incubated at 37 °C for 15 minutes, 0.5 µl of RNase A (10 mg/ml) and 1.3 µl of 0.25
M EDTA were added to each reaction mixture and the mixture was incubated for another 5 minutes. The
reaction mixture was extracted with phenol, and then with chloroform. The reaction products were
precipitated with ethanol and then solubilized in 6 µl of sample buffer (32% formamide, 6.7 mM
EDTA, 0.017% BPB and XC). The solubilized samples were heated at 95 °C for 2 minutes. The msDNA is
separated and analyzed on a 10% sequencing gel.
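
How a sequence is read from the four chain-termination reactions can be sketched as follows. This is a generic illustration of dideoxy ladder reading, not the analysis actually performed here: the function name `read_ladder` and the fragment lengths are hypothetical toy values, and each lane is assumed to contain the fragments terminated by one ddNTP.

```python
def read_ladder(lanes):
    """Read a sequence (5'->3' of the synthesized strand) from a dideoxy ladder.

    `lanes` maps each terminating base to the list of fragment lengths seen in
    that lane. Shorter fragments migrate further, so ordering all fragments by
    length and noting the lane of each one gives the sequence.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in base_at_length:
                raise ValueError(f"length {length} appears in more than one lane")
            base_at_length[length] = base
    return "".join(base_at_length[n] for n in sorted(base_at_length))


# Hypothetical toy ladder: fragment lengths observed in each of the four lanes.
toy_lanes = {
    "A": [2, 5],
    "C": [1],
    "G": [3],
    "T": [4],
}

print(read_ladder(toy_lanes))  # CAGTA
```

In the experiment described above the sequence read in this way from the ladder is compared with the known DNA sequence of msDNA-Ec67.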

By the procedure described above, other msDNAs can be synthesized in a similar manner from an 
RNA fragment carrying the msr-msd encoding region and the RTs. 

While preferred embodiments of the present invention have been described herein, it will be understood
that various changes and modifications may be made without departing from the spirit of the invention and
these are intended to be within the scope of the claims.

Claims


1. An msDNA molecule which has a DNA and a RNA portion derived from a retron which contains an 
msd-msr region coding respectively for the DNA and RNA portions of the msDNA, an open reading 
frame (ORF) encoding a reverse transcriptase (RT), which comprises a single-stranded RNA covalently
linked to a single-stranded DNA by a 2',5'-phosphodiester bond between the 2'-OH group of an internal
rG residue and the 5'-phosphate of the DNA strand and non-covalently linked to the DNA by
overlapping complementary nucleotides at the 3' ends of the RNA and DNA strands, which RNA and
DNA portions each form a secondary structure.

2. The msDNA of claim 1 whose synthesis involves the transcription of a long primary RNA transcript
beginning upstream from and including the msr region of the retron, extending to and including the
msd region and including the ORF encoding the reverse transcriptase, the primary transcript having
inverted repeats and folding the primary transcript thereby bringing together complementary domains
of the inverted repeat, allowing an internal guanosine residue in the 5' end of the RNA to be accessible
as a priming site for synthesis of the DNA portion of the msDNA by the reverse transcriptase.

3. The msDNA of claim 2 whose synthesis requires an RT which is capable of synthesizing a cDNA
strand from the RNA transcript which functions as a template and forming the branched linkage
between the 2'-hydroxyl group of the internal guanosine residue and the 5'-phosphate of the first
deoxyribonucleotide of the cDNA strand.

4. The msDNA of claim 3 whose synthesis further comprises removal of the RNA template within the
growing DNA/RNA complex and termination of msDNA synthesis.

5. The msDNA of claim 4 which has features as shown below

wherein X, Y, Z, L, S, W, V and Q are of varying nucleotide lengths.

6. The msDNA of claim 5 wherein there is one stem-loop structure in the DNA portion and at least one 
stem-loop structure in the RNA portion. 

7. The msDNA of claim 5 which is artificial. 

8. A reverse transcriptase which is capable of synthesizing a cDNA strand from a primed RNA template,
the synthesis being initiated with the formation of a 2',5'-linkage between the template molecule and
the first nucleotide at the 5' end of the cDNA strand.

9. The reverse transcriptase of claim 8 in the presence of a primer to initiate cDNA synthesis.

10. The reverse transcriptase of claim 8 wherein the template is an RNA template.

11. The reverse transcriptase of claim 8 wherein the template is a DNA template.

12. The reverse transcriptase of claim 8 which contains an RNase H domain.

13. The reverse transcriptase of claim 12 which also contains a tether domain.

14. The reverse transcriptase of claim 8 which does not contain an RNase H domain.

15. A synthesis of a single-stranded msDNA molecule which contains an RNA and a DNA portion which
comprises reacting

(a) a reverse transcriptase capable of synthesizing msDNA from an RNA transcript which contains an
msr-msd region which codes for the RNA and DNA regions, respectively, of the msDNA with
(b) the four dNTPs (dTTP, dCTP, dATP and dGTP) in the presence of the RNA transcript.

16. The synthesis of claim 15 which comprises adding RNase H to remove the 5' end of the RNA
transcript.

17. The synthesis of claim 16 which comprises isolating the msDNA.

18. The synthesis of claim 15 in which the reverse transcriptase synthesizes a cDNA strand from the RNA
template, the synthesis being initiated with the formation of a 2',5'-linkage between the template
molecule and the first nucleotide at the 5' end of the cDNA strand.

19. An msDNA which comprises a single-stranded RNA covalently linked to a single-stranded DNA by a
2',5'-phosphodiester bond between the 2'-OH group of an internal rG residue and the 5'-phosphate of
the DNA strand, and non-covalently linked to the DNA by overlapping complementary nucleotides at
the 3' ends of the RNA and DNA strands, which RNA and DNA portions each form a secondary
structure, which DNA or RNA portion contains a foreign DNA or RNA sequence respectively, which
msDNA is capable of being an antisense molecule with respect to an mRNA of a target protein.